Orlicz Spaces and Orlicz Hearts
Introduction
Orlicz spaces generalize the familiar \(L^p\) spaces. They are based on a simple idea: instead of measuring the size of a random variable \(X\) by a power \(\vert X\vert^p\), measure it by a more flexible convex growth function \(\Phi(\vert X\vert)\). That small change turns out to be remarkably expressive. It lets us distinguish tails that all look identical from the point of view of moments, separate exponential from stretched-exponential from lognormal behavior, and build spaces tuned to exactly the kind of integrability a specific problem requires.
For probability and risk theory, that flexibility can be important. The \(L^p\) scale only sees polynomial moments. A lognormal random variable lies in every \(L^p\), but it has no positive exponential moments at all; an exponential law also lies in every \(L^p\), but it sits exactly on the boundary of exponential integrability. Orlicz spaces register those differences immediately. They are therefore a natural language whenever one wants to talk not just about whether a random variable is integrable, but how integrable it is, and what sort of tail behavior lies behind that integrability.
There is also a structural reason to care. Many monetary risk measures are first developed on \(L^\infty\), where boundedness keeps the analysis tidy. But the interesting random variables in applications are often unbounded, and once one leaves \(L^\infty\) the choice of ambient space starts to matter. Orlicz spaces and, even more importantly, Orlicz hearts provide a natural extension framework: large enough to contain many unbounded variables, but still rigid enough to support useful norm, order, and duality theory.
One of the nicest features of the subject is that the definitions are elementary, and the examples are vivid. Exponential random variables, normals, lognormals, Pareto laws, and Weibull tails all land in different places depending on the Young function \(\Phi\). The resulting picture is more than technical bookkeeping: it is a tail-sensitive map of the risk landscape.
Definitions
A Young function is a convex, increasing, left-continuous function \[ \Phi:[0,\infty)\to[0,\infty] \] with \(\Phi(0)=0\), not identically zero, and \(\Phi(t)\to\infty\) as \(t\to\infty\). In most classical examples \(\Phi\) is finite-valued, but allowing the value \(\infty\) is convenient and harmless.
Given a Young function \(\Phi\), its complementary Young function \(\Psi\) is defined by the Legendre transform \[ \Psi(y)=\sup_{x\ge 0}\,\bigl\{xy-\Phi(x)\bigr\}, \qquad y\ge 0. \] Then \(\Psi\) is again a Young function, and Young’s inequality holds: \[ xy \le \Phi(x)+\Psi(y), \qquad x,y\ge 0. \]
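As a quick numerical sanity check, the following sketch approximates the Legendre transform for \(\Phi_{\exp}(t)=e^t-t-1\) and compares it with the closed form \(\Psi(y)=(1+y)\log(1+y)-y\), obtained by setting the derivative \(y-e^x+1\) of \(xy-\Phi_{\exp}(x)\) to zero at \(x=\log(1+y)\). The helper names `phi_exp`, `psi_exp`, and `conjugate`, and the search bound `x_max`, are illustrative choices only.

```python
import math

def phi_exp(t):
    """Exponential Young function e^t - t - 1."""
    return math.exp(t) - t - 1.0

def psi_exp(y):
    """Closed-form complementary function (1 + y) log(1 + y) - y."""
    return (1.0 + y) * math.log1p(y) - y

def conjugate(phi, y, x_max=50.0, iters=200):
    """Approximate sup over 0 <= x <= x_max of x*y - phi(x).

    Uses ternary search, which is valid because x*y - phi(x) is concave
    in x when phi is convex. The bound x_max is an assumption: it must
    be large enough to contain the maximizer.
    """
    lo, hi = 0.0, x_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if m1 * y - phi(m1) < m2 * y - phi(m2):
            lo = m1
        else:
            hi = m2
    x = 0.5 * (lo + hi)
    return x * y - phi(x)

if __name__ == "__main__":
    for y in (0.5, 1.0, 2.0):
        print(y, conjugate(phi_exp, y), psi_exp(y))
```

The two printed columns should agree to many digits; the same `conjugate` helper can be pointed at any other finite convex \(\Phi\) whose maximizer lies below `x_max`.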
Example 1
- \(\Phi_p(t)=t^p\), \(p\ge 1\)
- \(\Phi_{\log}(t)=t\log(1+t)\)
- \(\Phi_\beta(t)=e^{t^\beta}-1\), \(0<\beta<1\)
- \(\Phi_{\exp}(t)=e^t-t-1\).
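The first gives the \(L^p\) scale; the last three measure increasingly thin tails. (For \(0<\beta<1\) the function \(e^{t^\beta}-1\) is concave near the origin, so strictly speaking one replaces it by a convex function that agrees with it for large \(t\); on a probability space this does not change the resulting spaces.)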
Definition 1 Let \((\Omega,\mathcal F,\mathsf{P})\) be a probability space, and let \(X\) be a measurable real-valued random variable. Associated with \(\Phi\) is the modular \[ I_\Phi(X)=\mathsf{P}\,\Phi(\vert X\vert). \] The Orlicz space is \[ L^\Phi=\set{X:\exists c>0 \text{ such that } \mathsf{P}\,\Phi(\vert X\vert/c)<\infty}, \] and the Orlicz heart is \[ M^\Phi=\set{X:\forall c>0,\ \mathsf{P}\,\Phi(\vert X\vert/c)<\infty}. \]
In the definition, \(L^\Phi\) asks whether \(\vert X\vert\) becomes \(\Phi\)-integrable after some rescaling, whereas \(M^\Phi\) requires \(\Phi\)-integrability at every scale. Always, \[ M^\Phi \subseteq L^\Phi. \]
The standard norm is the Luxemburg norm \[ \|X\|_\Phi=\inf\left\{c>0:\mathsf{P}\,\Phi(\vert X\vert/c)\le 1\right\}. \]
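Because \(c\mapsto \mathsf{P}\,\Phi(\vert X\vert/c)\) is decreasing in \(c\), the Luxemburg norm can be located by bisection. The following is a minimal sketch under simplifying assumptions: \(X\) takes finitely many values with known positive probabilities, and \(\Phi\) is finite-valued. The function names `modular` and `luxemburg_norm` are hypothetical helpers introduced here for illustration.

```python
import math

def phi_exp(t):
    """Exponential Young function e^t - t - 1."""
    return math.exp(t) - t - 1.0

def modular(phi, values, probs, c):
    """Return E[phi(|X| / c)] for a finitely supported random variable X."""
    return sum(p * phi(abs(x) / c) for x, p in zip(values, probs))

def luxemburg_norm(phi, values, probs, rel_tol=1e-12):
    """Approximate inf{c > 0 : E[phi(|X| / c)] <= 1} by bisection.

    Assumes phi is a finite-valued Young function and that X takes the
    given values with the given positive probabilities.
    """
    scale = max(abs(x) for x, p in zip(values, probs) if p > 0)
    if scale == 0:
        return 0.0
    hi = scale
    while modular(phi, values, probs, hi) > 1.0:   # grow until feasible
        hi *= 2.0
    lo = hi
    while modular(phi, values, probs, lo) <= 1.0:  # shrink until infeasible
        lo /= 2.0
    while hi - lo > rel_tol * hi:                  # feasible hi, infeasible lo
        mid = 0.5 * (lo + hi)
        if modular(phi, values, probs, mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi

if __name__ == "__main__":
    values = [1.0, -2.0, 3.0]
    probs = [0.5, 0.3, 0.2]
    print(luxemburg_norm(lambda t: t ** 2, values, probs))  # compare with sqrt(E X^2)
    print(luxemburg_norm(phi_exp, values, probs))
```

For \(\Phi(t)=t^2\) the result should match the usual \(L^2\) norm \((\mathsf{P}X^2)^{1/2}\), which gives a simple way to check the routine before applying it to other Young functions.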
Main Theorems
For every Young function \(\Phi\), the spaces \(L^\Phi\) and \(M^\Phi\), equipped with the Luxemburg norm, are Banach spaces.
If \(\Phi\) is finite-valued, the bounded random variables are contained in the heart: \[ L^\infty \subseteq M^\Phi \subseteq L^\Phi. \] Moreover, \(M^\Phi\) is then the norm closure of \(L^\infty\) in \(L^\Phi\).
The next theorem explains when the heart and the full space agree. Recall that \(\Phi\) satisfies the \(\Delta_2\) condition if there exist \(K>0\) and \(t_0\ge 0\) such that \[ \Phi(2t)\le K\Phi(t), \qquad t\ge t_0. \]
If \(\Phi\) satisfies \(\Delta_2\), then \[ M^\Phi=L^\Phi. \] If \(\Phi\) fails \(\Delta_2\), then the heart is generally strictly smaller than the full Orlicz space.
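The condition is easy to test on the functions of Example 1; the constants below are chosen for convenience, not optimality. For the power and logarithmic functions, \[ \Phi_p(2t)=2^p\,\Phi_p(t), \qquad \Phi_{\log}(2t)=2t\log(1+2t)\le 4t\log(1+t)=4\,\Phi_{\log}(t), \] where the inequality uses \(1+2t\le(1+t)^2\). So both satisfy \(\Delta_2\), with \(K=2^p\) and \(K=4\) respectively. By contrast, \[ \frac{\Phi_{\exp}(2t)}{\Phi_{\exp}(t)}=\frac{e^{2t}-2t-1}{e^{t}-t-1}\to\infty, \qquad \frac{e^{(2t)^\beta}-1}{e^{t^\beta}-1}\sim e^{(2^\beta-1)t^\beta}\to\infty \qquad (t\to\infty), \] so no constant \(K\) works and both \(\Phi_{\exp}\) and \(\Phi_\beta\) fail \(\Delta_2\).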
The duality theory is easiest for the heart.
Let \(\Phi\) be finite-valued and let \(\Psi\) be its complementary Young function. Then every \(Y\in L^\Psi\) defines a continuous linear functional on \(M^\Phi\) by \[ X\mapsto \mathsf{P}(XY), \] and every continuous linear functional on \(M^\Phi\) is of this form. Thus \[ (M^\Phi)^*=L^\Psi \] under the natural pairing.
For the full space \(L^\Phi\), the same integral pairing always embeds \(L^\Psi\) into \((L^\Phi)^*\), but if \(\Phi\) fails \(\Delta_2\) then extra singular functionals may appear. This is one reason the heart is often the cleaner object in convex duality.
Examples
The simplest case is \[ \Phi_p(t)=t^p. \] Then \[ L^{\Phi_p}=M^{\Phi_p}=L^p. \] So nothing new happens here; Orlicz theory begins once the growth differs from a pure power.
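At the opposite extreme, if \[ \Phi_\infty(t)= \begin{cases} 0, & 0\le t\le 1,\\ \infty, & t>1, \end{cases} \] then \[ L^{\Phi_\infty}=L^\infty, \] with the Luxemburg norm equal to the essential supremum norm. The heart, however, collapses: requiring \(\mathsf{P}\,\Phi_\infty(\vert X\vert/c)<\infty\) for every \(c>0\) forces \(\vert X\vert\le c\) almost surely for every \(c>0\), so \(M^{\Phi_\infty}=\{0\}\). This is the degenerate case in which \(\Phi\) takes the value \(\infty\).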
The most important nontrivial examples for tail analysis are \[ \Phi_{\exp}(t)=e^t-t-1 \qquad\text{and}\qquad \Phi_\beta(t)=e^{t^\beta}-1, \quad 0<\beta<1. \] The first asks for exponential moments; the second asks for stretched-exponential moments.
For nonnegative \(X\), the condition \(X\in L^\Phi\) means that \(\mathsf{P}\,\Phi(X/c)<\infty\) for some \(c\), and \(X\in M^\Phi\) means that the same holds for all \(c\). In the exponential case, \[ \Phi_{\exp}(t)=e^t-t-1, \] these conditions become \[ X\in L^{\Phi_{\exp}} \iff \exists a>0 \text{ such that } \mathsf{P}e^{aX}<\infty, \] and \[ X\in M^{\Phi_{\exp}} \iff \forall a>0,\ \mathsf{P}e^{aX}<\infty. \] So the full space corresponds to the existence of some positive exponential moment, and the heart corresponds to the existence of all positive exponential moments. For general real-valued \(X\), the analogous condition uses \(\vert X\vert\) and therefore controls both tails: \[ \mathsf{P}e^{a\vert X\vert}<\infty. \]
Similarly, for \[ \Phi_\beta(t)=e^{t^\beta}-1, \] membership in \(L^{\Phi_\beta}\) or \(M^{\Phi_\beta}\) measures the existence of some, or all, stretched-exponential moments \[ \mathsf{P}e^{a\vert X\vert^\beta}. \]
Examples in Terms of Random Variables
The examples are perhaps more transparent when phrased in the language of tail thickness.
Every bounded random variable lies in every Orlicz heart built from a finite-valued Young function: \[ L^\infty \subseteq M^\Phi \] whenever \(\Phi<\infty\) on \([0,\infty)\). This is immediate, since \(\vert X\vert/c\) is uniformly bounded for each \(c>0\) and a finite convex \(\Phi\) is bounded on bounded sets.
Normal random variables: let \(Z\sim N(0,1)\). Then \(Z\) has moments of every order, so \(Z\in L^p\) for all \(p<\infty\). In fact, \(Z\) lies in the exponential heart: \[ Z\in M^{\Phi_{\exp}}, \] because \[ \mathsf{P}\,\Phi_{\exp}(|Z|/c) = \mathsf{P}\left(e^{|Z|/c}-|Z|/c-1\right) <\infty \qquad\text{for every } c>0. \] Thus Gaussian tails are thinner than exponential growth in the Orlicz sense.
More generally, if \(\Phi_\beta(t)=e^{t^\beta}-1\), now allowing any exponent \(\beta>0\), then \(Z\in M^{\Phi_\beta}\) for every \(\beta<2\). At the boundary \(\beta=2\) one obtains a space/heart distinction, and for \(\beta>2\) the Gaussian tail is too thick for that Young function. This is a useful reminder that even within the stretched-exponential scale, the precise exponent matters.
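The boundary case \(\beta=2\) can be worked out explicitly from the standard Gaussian identity \(\mathsf{P}e^{sZ^2}=(1-2s)^{-1/2}\) for \(s<1/2\) (and \(=\infty\) for \(s\ge 1/2\)). With \(s=1/c^2\), \[ \mathsf{P}\left(e^{Z^2/c^2}-1\right)= \begin{cases} \left(1-\dfrac{2}{c^2}\right)^{-1/2}-1, & c>\sqrt 2,\\[1ex] \infty, & c\le\sqrt 2. \end{cases} \] The modular is finite for some \(c\) but not for all, which is exactly the space/heart distinction. Setting the finite branch equal to \(1\) gives \(1-2/c^2=1/4\), so the Luxemburg norm of a standard normal for this Young function is \(\sqrt{8/3}\).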
Exponential random variables: let \(X\sim\mathrm{Exp}(\lambda)\) with rate \(\lambda>0\). Then \(X\) has moments of every order, so again \(X\in L^p\) for all finite \(p\). But \[ \mathsf{P}\,\Phi_{\exp}(X/c) = \mathsf{P}\left(e^{X/c}-X/c-1\right) \] is finite exactly when \(1/c<\lambda\). Therefore \[ X\in L^{\Phi_{\exp}}\setminus M^{\Phi_{\exp}}. \] This is the canonical example where the heart is strictly smaller than the full space. The random variable has some positive exponential moments, but not all.
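In this case the modular has a closed form, assuming \(\lambda\) denotes the rate, so that \(\mathsf{P}X=1/\lambda\) and \(\mathsf{P}e^{X/c}=\lambda c/(\lambda c-1)\) for \(\lambda c>1\): \[ \mathsf{P}\,\Phi_{\exp}(X/c)=\frac{\lambda c}{\lambda c-1}-\frac{1}{\lambda c}-1=\frac{1}{\lambda c\,(\lambda c-1)}, \qquad \lambda c>1, \] and the modular is infinite for \(\lambda c\le 1\). Requiring the right-hand side to be at most \(1\) gives \((\lambda c)^2-\lambda c-1\ge 0\), so the Luxemburg norm is \[ \|X\|_{\Phi_{\exp}}=\frac{1+\sqrt5}{2\lambda}. \]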
Gamma random variables have densities that are an exponential times a power of \(x\). Hence they behave qualitatively like the exponential law: all polynomial moments are finite; the variable belongs to \(L^{\Phi_{\exp}}\) but not to \(M^{\Phi_{\exp}}\); and it belongs to \(M^{\Phi_\beta}\) for every \(0<\beta<1\). This is a useful example because it shows that the polynomial factor does not change the main tail classification at the exponential scale.
Lognormal random variables: let \(X=e^Z\) with \(Z\sim N(0,1)\). Then \(\mathsf{P}X^p=e^{p^2/2}<\infty\) for every \(p<\infty\), so the lognormal lies in all \(L^p\) spaces. But it has no positive exponential moments: \(\mathsf{P}e^{aX}=\infty\) for every \(a>0\), so its moment generating function is infinite on the positive half-line (and, famously, the law is not determined by its moments). Therefore \(X\notin L^{\Phi_{\exp}}\). This is an important example for risk theory. The lognormal tail is too heavy for exponential Orlicz spaces, even though it has all polynomial moments. In the hierarchy of tails, it is lighter than any power tail, but heavier than any exponential tail. Slowing the Young function down to the stretched-exponential scale does not help: \(X^\beta=e^{\beta Z}\) is again lognormal, so \(\mathsf{P}e^{aX^\beta}=\infty\) for every \(a>0\) and every \(\beta>0\), and hence \(X\notin L^{\Phi_\beta}\) for every \(\beta>0\). The lognormal tail, of order \(e^{-(\log x)^2/2}\), is heavier than every stretched-exponential tail. To bring the lognormal back into the picture one needs a Young function that grows faster than every power but more slowly than every stretched exponential, for instance one of order \(e^{a(\log t)^2}\) for large \(t\). Then the lognormal belongs to the heart for \(a<1/2\); at the critical value \(a=1/2\) one again gets a space/heart distinction; and for \(a>1/2\) it belongs to neither. This is a good illustration of how Orlicz spaces distinguish among laws that all look identical from the point of view of the \(L^p\) scale.
Pareto random variables: let \(X\) have a Pareto tail \(\mathsf{P}(X>x)\asymp x^{-\alpha}\) with \(\alpha>0\). Then \(X\in L^p \iff p<\alpha\). Thus a Pareto law provides the standard example of a random variable with only some finite moments. Since every exponential or stretched-exponential Young function eventually dominates every power, such variables lie in none of the corresponding Orlicz spaces: \(X\notin L^{\Phi_{\exp}}\) and \(X\notin L^{\Phi_\beta}\) for every \(\beta>0\).
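For a concrete check, take the exact Pareto law \(\mathsf{P}(X>x)=x^{-\alpha}\) for \(x\ge 1\) (an illustrative special case of the tail condition). The layer-cake formula gives \[ \mathsf{P}X^p=\int_0^\infty p\,x^{p-1}\,\mathsf{P}(X>x)\,dx =1+\int_1^\infty p\,x^{p-1-\alpha}\,dx =\frac{\alpha}{\alpha-p}, \qquad p<\alpha, \] and the integral diverges for \(p\ge\alpha\). For the stretched-exponential scale, \[ \mathsf{P}e^{aX^\beta}=\int_1^\infty e^{a x^\beta}\,\alpha\,x^{-\alpha-1}\,dx=\infty \qquad\text{for all } a>0,\ \beta>0, \] because the integrand itself tends to infinity.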
Weibull random variables. Weibull tails, \(\mathsf{P}(X>x)\approx e^{-x^a}\) with shape \(a>0\), form a convenient one-parameter family ranging from heavier-than-exponential to lighter-than-exponential behavior. If \(a<1\), the tail is heavier than exponential; if \(a=1\), it is exponential; if \(a>1\), it is lighter than exponential. The Orlicz classification is correspondingly simple. For the full exponential Young function \(\Phi_{\exp}\):
- if \(a>1\), then \(X\in M^{\Phi_{\exp}}\);
- if \(a=1\), then typically \(X\in L^{\Phi_{\exp}}\setminus M^{\Phi_{\exp}}\);
- if \(a<1\), then \(X\notin L^{\Phi_{\exp}}\).
Likewise, for \(\Phi_\beta(t)=e^{t^\beta}-1\), the boundary occurs at \(\beta=a\):
- if \(\beta<a\), then \(X\in M^{\Phi_\beta}\);
- if \(\beta=a\), then typically \(X\in L^{\Phi_\beta}\setminus M^{\Phi_\beta}\);
- if \(\beta>a\), then \(X\notin L^{\Phi_\beta}\).
The Weibull family is therefore a useful master example.
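The two lists above amount to a single comparison between the shape \(a\) and the growth exponent of the Young function, with \(\Phi_{\exp}\) playing the role of \(\beta=1\). A minimal sketch of that rule follows; the function name `weibull_membership` and its string labels are hypothetical, and the boundary case assumes the tail is exactly of order \(e^{-x^a}\), which is what "typically" refers to above.

```python
def weibull_membership(a, beta=1.0):
    """Classify a Weibull-type tail P(X > x) ~ exp(-x**a) against a Young
    function with growth exp(t**beta); beta = 1 stands for Phi_exp.

    Returns "heart", "space only", or "neither". The boundary case
    beta == a assumes the tail is exactly of order exp(-x**a); exact
    float equality is used, so pass the boundary values explicitly.
    """
    if a <= 0 or beta <= 0:
        raise ValueError("a and beta must be positive")
    if beta < a:
        return "heart"        # all moments E exp(c |X|**beta) are finite
    if beta == a:
        return "space only"   # some, but not all, such moments are finite
    return "neither"          # no such moment is finite

if __name__ == "__main__":
    for a in (0.5, 1.0, 2.0):
        print(a, weibull_membership(a), weibull_membership(a, beta=0.5))
```

Running the loop reproduces both bullet lists at once: the first column of labels is the classification for \(\Phi_{\exp}\), the second for \(\Phi_\beta\) with \(\beta=1/2\).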
Remark (Exponential random variable revisited via the quantile function). It is helpful to realize the exponential law explicitly on the unit interval. Let \(U\sim U(0,1)\), let \(\theta>0\), and define \[ X=\theta(-\log(1-U)). \] Then \(X\) is exponential with mean \(\theta\), that is, \(X\sim \mathrm{Exp}(\lambda)\) with rate \(\lambda=1/\theta\). This is just the quantile-function representation of the exponential law. Now consider the exponential Young function \(\Phi_{\exp}(t)=e^t-t-1\). The Orlicz-space condition is \[ X\in L^{\Phi_{\exp}} \iff \exists c>0 \text{ such that } \int_0^1 \Phi_{\exp}(X(u)/c)\,du<\infty, \] and the heart condition is \[ X\in M^{\Phi_{\exp}} \iff \forall c>0,\ \int_0^1 \Phi_{\exp}(X(u)/c)\,du<\infty. \] Substituting the quantile form of \(X\) gives \[ \int_0^1 \Phi_{\exp}(X(u)/c)\,du = \int_0^1 \left(e^{\theta(-\log(1-u))/c}-\frac{\theta(-\log(1-u))}{c}-1\right)\,du. \] The first term simplifies to \[ e^{\theta(-\log(1-u))/c}=(1-u)^{-\theta/c}, \] so the integral becomes \[ \int_0^1 \left((1-u)^{-\theta/c}-\frac{\theta}{c}(-\log(1-u))-1\right)\,du. \]
Now everything is visible on \((0,1)\). The only possible singularity is near \(u=1\), where \(X(u)\to\infty\). The logarithmic term is harmless: \[ \int_0^1 -\log(1-u)\,du=1<\infty. \] So the decisive term is, with \(v=1-u\), \[ \int_0^1 (1-u)^{-\theta/c}\,du = \int_0^1 v^{-\theta/c}\,dv, \] which is finite exactly when \[ \frac{\theta}{c}<1, \qquad\text{that is,}\qquad c>\theta. \] Therefore the Orlicz integral is finite for every \(c>\theta\), but not for all \(c>0\). Hence \[ X\in L^{\Phi_{\exp}}\setminus M^{\Phi_{\exp}}. \] The exponential random variable corresponds on \((0,1)\) to the logarithmic blow-up \[ X(u)=\theta(-\log(1-u)) \] at the endpoint \(u=1\). Applying the Young function produces the power singularity \[ (1-u)^{-\theta/c}, \] and the threshold between the full Orlicz space and the heart is exactly the threshold for integrability of that power at \(u=1\).
The same calculation also shows the moment-generating interpretation. Since \[ \mathsf{P}e^{aX} = \int_0^1 e^{aX(u)}\,du = \int_0^1 (1-u)^{-a\theta}\,du, \] we have \(\mathsf{P}e^{aX}<\infty\iff a\theta<1\), that is, \(a<\lambda\). Some positive exponential moments exist, but not all. That is exactly why \(X\) belongs to the Orlicz space but not to the Orlicz heart for \(\Phi_{\exp}\).
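The integrability threshold in the Remark can be made tangible by truncating the power integral \(\int_0^1 v^{-r}\,dv\) at a small \(\varepsilon>0\), where \(r\) stands for the exponent appearing in the Remark (the scale parameter divided by \(c\), or \(a\) times the scale parameter in the moment-generating computation). The sketch below uses the elementary antiderivative; the name `truncated_power_integral` is a hypothetical helper.

```python
import math

def truncated_power_integral(r, eps):
    """Return the integral of v**(-r) over [eps, 1], for 0 < eps < 1."""
    if r == 1.0:
        return -math.log(eps)                      # antiderivative is log v
    return (1.0 - eps ** (1.0 - r)) / (1.0 - r)    # antiderivative v**(1-r)/(1-r)

if __name__ == "__main__":
    for r in (0.5, 1.0, 1.5):
        row = [truncated_power_integral(r, eps) for eps in (1e-3, 1e-6, 1e-9)]
        print(r, row)
```

For \(r<1\) the truncated values settle down to \(1/(1-r)\) as \(\varepsilon\to 0\); for \(r=1\) they grow logarithmically, and for \(r>1\) they grow like a power of \(1/\varepsilon\), which is the divergence that excludes the exponential law from the heart.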
For the slower Young function \(\Phi_\beta(t)=e^{t^\beta}-1\) with \(0<\beta<1\), the exponential tail is thin enough that \(X\in M^{\Phi_\beta}\). Thus the same random variable may lie in the heart for a slower Young function, while lying only in the full space for a faster one.
The exponential law already shows that the heart may be strictly smaller than the space. A second useful example is any gamma law. More generally, any random variable whose tail is asymptotically exponential sits at this boundary: some exponential moments exist, but not all. The same phenomenon occurs for Gaussian random variables with the faster Young function \(\Phi_2(t)=e^{t^2}-1\) (that is, \(\Phi_\beta\) with \(\beta=2\), not the power function \(t^2\)). If \(Z\sim N(0,1)\), then \(Z\in L^{\Phi_2}\setminus M^{\Phi_2}\). Thus the distinction between heart and space is not tied to any special distribution; it reflects a mismatch between tail thickness and the precise growth rate built into the Young function.
The heart can contain unbounded random variables; it is not merely a bounded-function space in disguise. For the exponential Young function \(\Phi_{\exp}\), every normal random variable lies in the heart, and of course it is unbounded. Likewise, if \(X\) has a Weibull tail with shape parameter \(a>1\), then \(X\) is unbounded but belongs to \(M^{\Phi_{\exp}}\). Thus \[ L^\infty \subsetneq M^{\Phi_{\exp}} \] on any rich probability space. The heart contains many unbounded variables; it simply excludes those whose tails are too thick relative to the prescribed growth function.
Summary by Tail Thickness
The main heuristic may be summarized as follows. A Young function defines a growth scale, and the membership of a random variable is determined by comparing that growth scale to the decay of its tails.
For the exponential Young function \(\Phi_{\exp}(t)=e^t-t-1\):
- bounded, normal, and super-exponential Weibull variables lie in the heart;
- exponential and gamma variables lie in the full space but not in the heart;
- lognormal, stretched-exponential Weibull with shape \(a<1\), and Pareto variables lie in neither.
For slower Young functions such as \(\Phi_\beta(t)=e^{t^\beta}-1\) with \(0<\beta<1\), more variables move into the heart; for faster Young functions, more variables leave the space altogether.
This perspective makes clear why Orlicz spaces are useful in the background of monetary risk measures. The \(L^p\) scale distinguishes only polynomial moments, whereas many questions in risk theory depend more sensitively on the exact rate of tail decay.
Duality
The duality theory in Orlicz spaces is straightforward, until it isn’t. For the Orlicz heart \(M^\Phi\), everything works nicely: the continuous dual is the complementary Orlicz space \(L^\Psi\), and every continuous linear functional has the honest integral form \[ X \mapsto \mathsf{P}(XY). \] That is the civilized part of the theory. It is the analogue of the familiar \(L^p\)–\(L^q\) pairing, and it is the main reason hearts are so attractive in convex analysis and risk theory.
The full Orlicz space \(L^\Phi\) is more treacherous. If \(\Phi\) satisfies the \(\Delta_2\) condition, then the heart and the space coincide, so the same clean duality survives. But when \(\Phi\) grows too fast and \(\Delta_2\) fails, the heart becomes strictly smaller than the space, and the Banach dual of \(L^\Phi\) acquires extra linear functionals that are not given by integration against any \(Y\in L^\Psi\). Those are the singular functionals. They vanish on the heart, so they are invisible on the well-behaved part of the space, but they act nontrivially on the additional elements of \(L^\Phi\) that lie beyond the heart.
That is the first genuinely unsettling point: once \(M^\Phi \neq L^\Phi\), the quotient \(L^\Phi/M^\Phi\) is nontrivial, and Hahn-Banach manufactures dual objects living entirely on that quotient. They detect the part of the space that ordinary countably additive integration cannot see. In spirit, these are close cousins of the purely finitely additive functionals in \(ba=(L^\infty)^*\). The same pathology reappears in a new costume: the Köthe dual remains the countably additive, order-continuous part, but the full Banach dual is larger and contains singular witnesses supported on the gap between heart and space.
So the duality slogan is sharp. On the heart, duality is honest. On the full space, duality may be haunted. The extra ghosts appear exactly when the growth function is fast enough to pull \(L^\Phi\) apart from its heart, and they are the Orlicz-space analogue of the old finitely additive weirdness that already lurks behind \(L^\infty\).
Further Reading
See the books Krasnosel’skii and Rutickii (1961), Rao and Ren (1991) and Musielak (2006) for the general theory, and the papers Cheridito and Li (2008), Cheridito and Li (2009), Arai (2009), Delbaen (2010), Gao et al. (2018) and Gao, Leung, and Xanthos (2019) for applications to monetary risk measures.
